Smart Paper Notes

AutoHEnsGNN: Winning Solution to AutoGraph Challenge for KDD Cup 2020

Jin Xu, Mingjian Chen, Jianqiang Huang, Xingyuan Tang, Ke Hu, Jian Li, Jia Cheng, Jun Lei

Categories: Machine Learning | Artificial Intelligence

2021-11-25

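Graph neural networks (GNNs) have become increasingly popular and have achieved impressive results in many graph-based applications. However, designing effective architectures requires extensive manual work and domain knowledge, and the results of GNN models show high variance across training setups, which limits the application of existing GNN models. In this paper, we present AutoHEnsGNN, a framework for building effective and robust models for graph tasks without any human intervention. AutoHEnsGNN won first place in the AutoGraph Challenge for KDD Cup 2020 and achieved the best rank score on five real-life datasets in the final phase. Given a task, AutoHEnsGNN first applies a fast proxy evaluation to automatically select a pool of promising GNN models. It then builds a hierarchical ensemble framework: 1) we propose graph self-ensemble (GSE), which reduces the variance caused by weight initialization and efficiently exploits information from local and global neighborhoods; 2) based on GSE, a weighted ensemble of different types of GNN models is used to learn more discriminative node representations. To search the architectures and ensemble weights efficiently, we propose AutoHEnsGNN$_{\text{Gradient}}$, which treats the architectures and ensemble weights as architecture parameters and uses gradient-based architecture search to obtain optimal configurations, and AutoHEnsGNN$_{\text{Adaptive}}$, which adaptively adjusts the ensemble weights based on model accuracy. Extensive experiments on node classification, graph classification, edge prediction, and the KDD Cup challenge demonstrate the effectiveness and generality of AutoHEnsGNN.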
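The idea of adjusting ensemble weights according to model accuracy can be illustrated with a minimal sketch. This is not the authors' implementation: the softmax form, the `temperature` parameter, and both function names are assumptions chosen purely for illustration.

```python
import math


def accuracy_weights(val_accuracies, temperature=0.1):
    """Turn validation accuracies into ensemble weights via a softmax.

    `temperature` is a hypothetical knob: smaller values concentrate
    more weight on the most accurate model.
    """
    scaled = [acc / temperature for acc in val_accuracies]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_ensemble(model_probs, weights):
    """Combine per-model class probabilities for a single node."""
    num_classes = len(model_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, model_probs))
        for c in range(num_classes)
    ]


if __name__ == "__main__":
    # Three hypothetical GNN models with their validation accuracies.
    weights = accuracy_weights([0.80, 0.85, 0.70])
    combined = weighted_ensemble([[0.6, 0.4], [0.7, 0.3], [0.2, 0.8]], weights)
    print(weights, combined)
```

Because the weights sum to one, the combined output stays a valid probability distribution whenever each model's output is one.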

Translated by Google Translate

Robust Information Bottleneck for Task-Oriented Communication with Digital Modulation

Songjie Xie, Youlong Wu, Shuai Ma, Ming Ding, Yuanming Shi, Mingjian Tang

Categories: Machine Learning

2022-09-21

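Task-oriented communication, mostly based on learned joint source-channel coding (JSCC), aims to design communication-efficient edge inference systems by transmitting only task-relevant information to the receiver. However, transmitting task-relevant information without any redundancy may cause robustness issues in learning due to channel variations, and JSCC, which maps source data directly into continuous channel input symbols, poses compatibility issues with existing digital communication systems. In this paper, we address both issues by first investigating the inherent tradeoff between the informativeness of the encoded representations and their robustness to distortion in the received information, and then proposing a task-oriented communication scheme with digital modulation, named discrete task-oriented JSCC (DT-JSCC), in which the transmitter encodes the features into a discrete representation and transmits it to the receiver using a digital modulation scheme. Within the DT-JSCC scheme, we develop a robust encoding framework, named robust information bottleneck (RIB), to improve robustness to channel variations, and we use a variational approximation to derive a tractable variational upper bound on the RIB objective, overcoming the computational intractability of mutual information. Experimental results show that the proposed DT-JSCC achieves better inference performance than the baseline methods with low communication latency, and that it is robust to channel variations thanks to the RIB framework.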
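To make the "discrete representation plus digital modulation" step concrete, here is a minimal sketch that maps discrete indices to phase-shift-keying (PSK) symbols and recovers them by nearest-point detection. The choice of PSK, the constellation order, and all function names are assumptions for illustration; the abstract does not specify them, and the learned encoder and the RIB objective are not modeled here.

```python
import cmath
import math


def psk_constellation(order):
    """Return `order` equally spaced points on the unit circle."""
    return [cmath.exp(2j * math.pi * k / order) for k in range(order)]


def modulate(indices, constellation):
    """Map each discrete index to its constellation symbol."""
    return [constellation[i] for i in indices]


def demodulate(symbols, constellation):
    """Recover indices by choosing the nearest constellation point."""
    return [
        min(range(len(constellation)), key=lambda k: abs(s - constellation[k]))
        for s in symbols
    ]


if __name__ == "__main__":
    points = psk_constellation(8)
    sent = [0, 3, 5, 7]  # hypothetical discrete feature indices
    # A small fixed perturbation stands in for channel noise.
    received = [s + (0.05 + 0.05j) for s in modulate(sent, points)]
    print(demodulate(received, points))
```

As long as the perturbation is smaller than half the distance between neighboring constellation points, the indices are recovered exactly; larger channel variations cause index errors, which is the robustness problem the abstract describes.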

Action-conditioned On-demand Motion Generation

Qiujing Lu, Yipeng Zhang, Mingjian Lu, Vwani Roychowdhury

Categories: Computer Vision

2022-07-17

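We propose a novel framework, On-Demand MOtion Generation (ODMO), for generating realistic and diverse long-term 3D human motion sequences conditioned only on action type, with an additional capability for customization. When evaluated on three public datasets (HumanAct12, UESTC, and MoCap), ODMO improves over SOTA approaches on all traditional motion evaluation metrics. Furthermore, we provide qualitative evaluations and quantitative metrics demonstrating several first-of-their-kind customization capabilities afforded by our framework, including mode discovery, interpolation, and trajectory customization. These capabilities significantly widen the range of potential applications of such motion generation models. The novel on-demand generative capabilities are enabled by innovations in both the encoder and decoder architectures: (i) Encoder: contrastive learning in a low-dimensional latent space is used to create a hierarchical embedding of motion sequences, in which not only do the codes of different action types form distinct groups, but within an action type, codes of similar inherent patterns (motion styles) cluster together, making them readily discoverable; (ii) Decoder: a hierarchical decoding strategy first reconstructs the motion trajectory, which is then used to reconstruct the whole motion sequence. Such an architecture enables effective trajectory control. Our code is released on GitHub: https://github.com/roychowdhuryresearch/odmo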

Wholesale Electricity Price Forecasting using Integrated Long-term Recurrent Convolutional Network Model

Vasudharini Sridharan, Mingjian Tuo, Xingpeng Li

Categories: Machine Learning

2021-12-23

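Electricity price is a key factor affecting the decisions of all market participants. Accurate electricity price forecasting is very important, and it is also very challenging because prices are highly volatile due to a variety of factors. This paper proposes an integrated long-term recurrent convolutional network (ILRCN) model that predicts electricity prices while considering most of the attributes contributing to the market price. The proposed ILRCN model combines the capabilities of a convolutional neural network and the long short-term memory (LSTM) algorithm with a proposed novel conditional error correction term. The combined ILRCN model can identify both linear and nonlinear behavior within the input data. We use ERCOT wholesale market price data, together with load profiles, temperature, and other factors, to illustrate the proposed model. The performance of the proposed ILRCN electricity price forecasting model is verified using evaluation metrics such as mean absolute error and accuracy. Case studies show that the proposed ILRCN model is accurate and efficient in electricity price forecasting compared with the support vector machine (SVM) model, the fully connected neural network model, the LSTM model, and the LRCN model.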
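For reference, the mean absolute error named above is the average of the absolute differences between actual and forecast prices. The sketch below computes it on made-up numbers; the function name and the sample prices are illustrative assumptions, not data from the paper.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Hypothetical hourly prices in $/MWh (not from the paper).
    actual_prices = [30.0, 45.0, 50.0]
    forecast_prices = [28.0, 47.0, 56.0]
    print(mean_absolute_error(actual_prices, forecast_prices))
```

The absolute errors here are 2, 2, and 6, so the result is 10/3, in the same units as the prices.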

Dynamic Resolution Network

Mingjian Zhu, Kai Han, Enhua Wu, Qiulin Zhang, Ying Nie, Zhenzhong Lan, Yunhe Wang

Categories: Computer Vision

2021-06-05

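Deep convolutional neural networks (CNNs) are often of sophisticated design, with numerous learnable parameters, for the sake of accuracy. To alleviate the high cost of deploying them on mobile devices, recent works have made great efforts to uncover redundancy in predefined architectures. Nevertheless, the redundancy in the input resolution of modern CNNs has not been fully investigated; that is, the resolution of the input image is fixed. In this paper, we observe that the smallest resolution at which the same neural network accurately predicts a given image differs from image to image. To this end, we propose a novel dynamic-resolution network (DRNet), in which the input resolution is determined dynamically for each input sample. A resolution predictor with negligible computational cost is explored and optimized jointly with the desired network. Specifically, the predictor learns the smallest resolution that can retain, or even exceed, the original recognition accuracy for each image. During inference, each input image is resized to its predicted resolution to minimize the overall computational burden. We then conduct extensive experiments on several benchmark networks and datasets. The results show that DRNet can be embedded in any off-the-shelf network architecture to obtain a considerable reduction in computational complexity. For instance, DR-ResNet-50 achieves similar performance with about a 34% reduction in computation, and gains a 1.4% accuracy increase with a 10% reduction in computation, compared to the original ResNet-50 on ImageNet.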
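The inference flow described above (predict a resolution per image, resize, then run the main network) can be sketched as follows. Everything here is a stand-in: the candidate resolutions, the contrast-based `toy_predictor`, the `mean_backbone`, and nearest-neighbor resizing are assumptions for illustration, whereas the paper's predictor is a learned network trained jointly with the backbone.

```python
CANDIDATE_RESOLUTIONS = [4, 6, 8]  # hypothetical side lengths, low to high


def resize_nearest(image, size):
    """Nearest-neighbor resize of a 2D list of pixels to size x size."""
    height, width = len(image), len(image[0])
    return [
        [image[r * height // size][c * width // size] for c in range(size)]
        for r in range(size)
    ]


def toy_predictor(image):
    """Stand-in for a learned predictor: returns a candidate index.

    Low-contrast images are assumed to need less resolution.
    """
    pixels = [p for row in image for p in row]
    contrast = max(pixels) - min(pixels)
    if contrast < 0.2:
        return 0
    if contrast < 0.6:
        return 1
    return 2


def mean_backbone(image):
    """Stand-in for the recognition network: mean pixel value."""
    return sum(sum(row) for row in image) / (len(image) * len(image[0]))


def dynamic_resolution_forward(image, predictor, backbone):
    """Resize the image to its predicted resolution, then run the backbone."""
    size = CANDIDATE_RESOLUTIONS[predictor(image)]
    return backbone(resize_nearest(image, size)), size


if __name__ == "__main__":
    flat_image = [[0.5] * 8 for _ in range(8)]
    print(dynamic_resolution_forward(flat_image, toy_predictor, mean_backbone))
```

The computational saving comes from the backbone processing fewer pixels on images the predictor considers easy, at the price of running the (cheap) predictor on every input.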

Generative appearance replay for continual unsupervised domain adaptation

Boqi Chen, Kevin Thandiackal, Pushpak Pati, Orcun Goksel

Categories: Computer Vision | Artificial Intelligence

2023-01-03

Deep learning models can achieve high accuracy when trained on large amounts of labeled data. However, real-world scenarios often involve several challenges: Training data may become available in installments, may originate from multiple different domains, and may not contain labels for training. Certain settings, for instance medical applications, often involve further restrictions that prohibit retention of previously seen data due to privacy regulations. In this work, to address such challenges, we study unsupervised segmentation in continual learning scenarios that involve domain shift. To that end, we introduce GarDA (Generative Appearance Replay for continual Domain Adaptation), a generative-replay based approach that can adapt a segmentation model sequentially to new domains with unlabeled data. In contrast to single-step unsupervised domain adaptation (UDA), continual adaptation to a sequence of domains enables leveraging and consolidation of information from multiple domains. Unlike previous approaches in incremental UDA, our method does not require access to previously seen data, making it applicable in many practical scenarios. We evaluate GarDA on two datasets with different organs and modalities, where it substantially outperforms existing techniques.

MGTAB: A Multi-Relational Graph-Based Twitter Account Detection Benchmark

Shuhao Shi, Kai Qiao, Jian Chen, Shuai Yang, Jie Yang, Baojie Song, Linyuan Wang, Bin Yan

Categories: Computer Vision

2023-01-03

The development of social media user stance detection and bot detection methods relies heavily on large-scale and high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, which holds back graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB was built based on the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we extracted the 20 user property features with the greatest information gain and user tweet features as the user features. In addition, we performed a thorough evaluation of MGTAB and other public datasets. Our experiments found that graph-based approaches are generally more effective than feature-based approaches and perform better when introducing multiple relations. By analyzing experiment results, we identify effective approaches for account detection and provide potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.

Explaining Imitation Learning through Frames

Boyuan Zheng, Jianlong Zhou, Chunjie Liu, Yiqiao Li, Fang Chen

Categories: Machine Learning | Computer Vision

2023-01-03

As one of the prevalent methods for building automation systems, Imitation Learning (IL) shows promising performance in a wide range of domains. However, despite the considerable improvement in policy performance, research on the explainability of IL models is still limited. Inspired by recent approaches in explainable artificial intelligence, we propose a model-agnostic explanation framework for IL models called R2RISE. R2RISE aims to explain the overall policy performance with respect to the frames in demonstrations. It iteratively retrains the black-box IL model on randomly masked demonstrations and uses the conventional evaluation outcome, the environment return, as the coefficient to build an importance map. We also conduct experiments to investigate three major questions concerning the equality of frames' importance, the effectiveness of the importance map, and connections between importance maps from different IL models. The results show that R2RISE successfully distinguishes important frames from the demonstrations.
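The masking-and-weighting loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: `toy_train_and_evaluate` is a hypothetical stand-in for retraining an IL model on the masked demonstration and measuring its environment return, and the per-frame normalization by keep count is one plausible choice, not necessarily the one used by R2RISE.

```python
import random


def frame_importance(num_frames, train_and_evaluate, iterations=200,
                     keep_prob=0.5, seed=0):
    """Estimate per-frame importance from randomly masked demonstrations.

    Each iteration keeps a random subset of frames, obtains a return from
    `train_and_evaluate(mask)`, and credits that return to the kept frames.
    """
    rng = random.Random(seed)
    weighted_sum = [0.0] * num_frames
    keep_count = [0] * num_frames
    for _ in range(iterations):
        mask = [rng.random() < keep_prob for _ in range(num_frames)]
        episode_return = train_and_evaluate(mask)
        for i, kept in enumerate(mask):
            if kept:
                weighted_sum[i] += episode_return
                keep_count[i] += 1
    # Average return over the iterations in which each frame was kept.
    return [s / max(c, 1) for s, c in zip(weighted_sum, keep_count)]


def toy_train_and_evaluate(mask):
    """Hypothetical stand-in: the return depends only on which frames are kept."""
    frame_value = [0.0, 0.0, 5.0, 0.0]  # frame 2 is the only useful one
    return sum(v for v, kept in zip(frame_value, mask) if kept)


if __name__ == "__main__":
    print(frame_importance(4, toy_train_and_evaluate))
```

In a real setting each call to the evaluation function involves a full retraining run, so the number of iterations dominates the cost of building the map.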

Saliency-Aware Spatio-Temporal Artifact Detection for Compressed Video Quality Assessment

Liqun Lin, Yang Zheng, Weiling Chen, Chengdong Lan, Tiesong Zhao

Categories: Computer Vision

2023-01-03

Compressed videos often exhibit visually annoying artifacts, known as Perceivable Encoding Artifacts (PEAs), which dramatically degrade video visual quality. Subjective and objective measures capable of identifying and quantifying various types of PEAs are critical in improving visual quality. In this paper, we investigate the influence of four spatial PEAs (i.e. blurring, blocking, bleeding, and ringing) and two temporal PEAs (i.e. flickering and floating) on video quality. For spatial artifacts, we propose a visual saliency model with a low computational cost and higher consistency with human visual perception. In terms of temporal artifacts, self-attention based TimeSFormer is improved to detect temporal artifacts. Based on the six types of PEAs, a quality metric called Saliency-Aware Spatio-Temporal Artifacts Measurement (SSTAM) is proposed. Experimental results demonstrate that the proposed method outperforms state-of-the-art metrics. We believe that SSTAM will be beneficial for optimizing video coding techniques.

Risk-Averse MDPs under Reward Ambiguity

Haolin Ruan, Zhi Chen, Chin Pang Ho

Categories: Machine Learning

2023-01-03

We propose a distributionally robust return-risk model for Markov decision processes (MDPs) under risk and reward ambiguity. The proposed model optimizes the weighted average of mean and percentile performances, and it covers the distributionally robust MDPs and the distributionally robust chance-constrained MDPs (both under reward ambiguity) as special cases. By considering that the unknown reward distribution lies in a Wasserstein ambiguity set, we derive the tractable reformulation for our model. In particular, we show that the return-risk model can also account for risk from an uncertain transition kernel when one only seeks deterministic policies, and that a distributionally robust MDP under the percentile criterion can be reformulated as its nominal counterpart at an adjusted risk level. A scalable first-order algorithm is designed to solve large-scale problems, and we demonstrate the advantages of our proposed model and algorithm through numerical experiments.
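The "weighted average of mean and percentile performances" can be made concrete with a small empirical sketch. It only evaluates such an objective on a fixed sample of returns; it does not implement the Wasserstein ambiguity set or the paper's reformulation. The function names, the quantile convention, and the sample values are assumptions for illustration.

```python
import math


def empirical_quantile(samples, alpha):
    """Smallest sample value v such that at least a fraction alpha
    of the samples are less than or equal to v."""
    ordered = sorted(samples)
    index = max(math.ceil(alpha * len(ordered)) - 1, 0)
    return ordered[index]


def return_risk_objective(returns, weight, alpha):
    """Weighted average of the sample mean and the alpha-percentile.

    weight = 1 gives the pure mean; weight = 0 gives the pure
    percentile criterion.
    """
    mean = sum(returns) / len(returns)
    return weight * mean + (1.0 - weight) * empirical_quantile(returns, alpha)


if __name__ == "__main__":
    sampled_returns = [1.0, 2.0, 3.0, 10.0]  # hypothetical policy returns
    print(return_risk_objective(sampled_returns, weight=0.5, alpha=0.25))
```

With these numbers the mean is 4 and the 25th percentile is 1, so an equal weighting yields 2.5; moving `weight` toward 0 makes the objective more risk-averse.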
